Abstract. The classical approach to multivariate extreme value modelling assumes that the joint distribution belongs to a multivariate domain of attraction. This requires each marginal distribution to be individually attracted to a univariate extreme value distribution. An apparently more flexible extremal model for multivariate data was proposed by Heffernan and Tawn, under which not all the components are required to belong to an extremal domain of attraction; the model assumes instead the existence of an asymptotic approximation to the conditional distribution of the random vector given that one of the components is extreme. Combined with the knowledge that the conditioning component belongs to a univariate domain of attraction, this leads to an approximation of the probability of certain risk regions. The original focus on conditional distributions had technical drawbacks but is natural in several contexts. We place this approach in the context of the more general approach using convergence of measures and multivariate regular variation on cones.
1. Overview

The classical approach to extreme value modelling for multivariate data is to assume that the joint distribution belongs to a multivariate domain of attraction. In particular, this requires that each marginal distribution be individually attracted to a univariate extreme value distribution. The domain of attraction condition may be phrased conveniently in terms of regular variation of the joint distribution on an appropriate cone; see Das and Resnick [5, Proposition 4.1].

A more flexible model for data realizations of a random vector was proposed by Heffernan and Tawn [11], under which not all the components are required to belong to an extremal domain of attraction. Such a model accommodates varying degrees of asymptotic dependence between pairs of components. Instead of starting from the joint distribution, Heffernan and Tawn assumed the existence of an asymptotic approximation to the conditional distribution of the random vector given that one of the components is extreme. Combined with the knowledge that the conditioning component belongs to a univariate domain of attraction, this leads to an approximation for the probabilities of certain multivariate risk sets. However, focusing on conditional distributions creates problems when taking limits, owing to ambiguity regarding the choice of version. So the Heffernan/Tawn approach was reformulated as the conditional extreme value model (CEVM) in [5, 6, 10] using regular variation of the joint distributions on a smaller cone than the one employed in multivariate extreme value theory, an approach related to hidden regular variation [12, 13, 15, 17].

Conditional distributions are natural objects in many circumstances, for example if densities 
exist or if one variable is an explicit function of others. So we return to the Heffernan and Tawn 
[11] formulation, placing it in a formal context that uses the idea of transition kernels in a domain of 
attraction developed in [18, 19]. We see how reliance on transition kernels fits with general theory 
expressed in terms of vague convergence of measures and to what extent the reliance on kernels 
restricts the class of limit measures. 

In order to better fit in with the study of extremes of a random vector, we extend the kernel domain of attraction condition used in [18] beyond standardized regular variation to accommodate general linear normalization in both the initial state and the distribution of the next state. We examine conditions under which this extends to a CEVM, when combined with a marginal domain of attraction assumption, and we derive explicit formulas for the CEV limit measure in different cases. Also, through a number of revealing examples, we explore the properties of the normalization functions, and technicalities surrounding the choice of version of the conditional distribution and the limit distribution $G$.

Section 2 summarizes necessary background, definitions and basic results, including when a random vector $(X, Y)$ satisfies the conditional extreme value model (CEVM). Section 3 treats the standard case where both $X$ and $Y$ can be scaled by the same function, and this restriction is weakened in Sections 4 and 5.

We denote by $M_+(\mathbb{E})$ the space of Radon measures on a nice space $\mathbb{E}$, topologized by vague convergence, which is written as $\stackrel{v}{\to}$ [16]. Weak convergence [2] of probability measures is denoted $\Rightarrow$. We write $\xi \sim G$ to mean that a random variable $\xi$ has distribution $G(\cdot)$ and, if no confusion can arise, we often use $G$ to also mean the distribution function $P[\xi \le x] = G(x)$. Alternatively, for a random variable $Y$, we write $F_Y$ for the distribution of $Y$. The class of regularly varying functions with index $\rho$ on $(0,\infty)$ is $RV_\rho$ [3, 7, 8, 16]. The probability measure degenerate at $c \in \mathbb{R}$ is $\epsilon_c(\cdot)$.



2. Background 

First, we review the basics of extended regular variation, which features prominently in the 
formulation of the CEVM, as well as concepts of univariate extreme value theory. We then define 
the conditional extreme value model and discuss its basic properties. 

2.1. Extended Regular Variation. Regular variation and extended regular variation are important in the mathematical description of extreme and conditional extreme value theory [3, 7, 15, 16, 21]. The pair of functions $a : (0,\infty) \to (0,\infty)$ and $f : (0,\infty) \to \mathbb{R}$ is extended regularly varying (ERV) with parameters $\rho, k \in \mathbb{R}$ if, as $t \to \infty$,

$$\frac{a(tx)}{a(t)} \to x^{\rho} \quad\text{and}\quad \frac{f(tx) - f(t)}{a(t)} \to \psi(x), \qquad x > 0, \tag{2.1}$$

[7, Appendix B.2], where

$$\psi(x) = \begin{cases} k\rho^{-1}(x^{\rho} - 1), & \rho \neq 0, \\ k \log x, & \rho = 0. \end{cases} \tag{2.2}$$

We will write this as $a, f \in ERV_{\rho,k}$ with $a \in RV_\rho$. A useful identity is

$$\psi(x^{-1}) = -x^{-\rho}\psi(x). \tag{2.3}$$

Note that this differs slightly from the usual definition of extended regular variation, which assumes $k = 1$. If $\phi(x) := \lim_{t\to\infty}(f(tx) - f(t))/a(t)$ exists for $x > 0$, then $a$ is necessarily regularly varying, and $\phi = \psi$, the function given in (2.2). Also, the convergences in (2.1) are locally uniform, implying that

$$\frac{a(t x_t)}{a(t)} \to x^{\rho} \quad\text{and}\quad \frac{f(t x_t) - f(t)}{a(t)} \to \psi(x) \qquad\text{whenever } x_t \to x > 0.$$

Furthermore, if $k \neq 0$ we obtain the following properties depending on the value of $\rho$. Recall the sign function $\mathrm{sgn}(u) = (u/|u|)\,1_{\{u \neq 0\}}$.

• If $\rho > 0$, then $f \cdot \mathrm{sgn}(k) \in RV_\rho$, and $f(t)/a(t) \to k/\rho$.

• If $\rho < 0$, then $f(\infty) = \lim_{t\to\infty} f(t)$ exists and is finite, $(f(\infty) - f(t))/a(t) \to k/|\rho|$ and $(f(\infty) - f)\cdot\mathrm{sgn}(k) \in RV_{-|\rho|}$.

• If $\rho = 0$, i.e., $a$ is slowly varying, then $f \in \Pi(a)$, the $\Pi$-varying functions with auxiliary function $a(\cdot)$ ([7, Appendix B.2]). Suppose $k > 0$. Then $f(\infty) \le \infty$ exists. If $f(\infty) = \infty$, then $f \in RV_0$ and $f(t)/a(t) \to \infty$. If $f(\infty) < \infty$, then $f(\infty) - f \in RV_0$, and $(f(\infty) - f(t))/a(t) \to \infty$. If $k < 0$, then $-f$ has these properties.
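As a small numerical illustration of (2.1)-(2.3), the following Python sketch uses a hypothetical ERV pair, $a(t) = t^\rho$ and $f(t) = \psi(t)$. This pair is our own choice, made only because the limit in (2.1) is then attained exactly for every $t$; the function names are likewise ours.

```python
import math

def psi(x, rho, k):
    """Limit function psi from (2.2)."""
    if rho != 0:
        return k * (x ** rho - 1.0) / rho
    return k * math.log(x)

def erv_increment(t, x, rho, k):
    """(f(tx) - f(t)) / a(t) for the illustrative pair a(t) = t**rho, f(t) = psi(t)."""
    a_t = t ** rho
    return (psi(t * x, rho, k) - psi(t, rho, k)) / a_t

def identity_gap(x, rho, k):
    """Difference between the two sides of identity (2.3); should vanish."""
    return psi(1.0 / x, rho, k) + x ** (-rho) * psi(x, rho, k)
```

For instance, `erv_increment(1e3, 2.0, 0.5, 3.0)` agrees with `psi(2.0, 0.5, 3.0)` up to rounding, and `identity_gap` returns a value at rounding level for any $x > 0$.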

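2.2. Domains of Attraction. For $\gamma \in \mathbb{R}$, define $\mathbb{E}_\gamma = \{x \in \mathbb{R} : 1 + \gamma x > 0\}$, so that

$$\mathbb{E}_\gamma = \begin{cases} (-\gamma^{-1}, \infty), & \gamma > 0, \\ (-\infty, \infty), & \gamma = 0, \\ (-\infty, |\gamma|^{-1}), & \gamma < 0. \end{cases} \tag{2.4}$$

The distribution $F_Y$ of a random variable $Y$ is in the domain of attraction of an extreme value distribution $G_\gamma$ for some $\gamma \in \mathbb{R}$, written $F_Y \in D(G_\gamma)$, if there exist functions $a(t) > 0$ and $b(t) \in \mathbb{R}$ such that

$$F_Y^{t}(a(t)y + b(t)) \to G_\gamma(y)$$

weakly as $t \to \infty$, where $G_\gamma(y) = \exp\{-(1 + \gamma y)^{-1/\gamma}\}$ for $y \in \mathbb{E}_\gamma$ [7, 16]. This can be reformulated in terms of the tail of the distribution $F_Y$ as

$$t\,P\left[\frac{Y - b(t)}{a(t)} > y\right] \to (1 + \gamma y)^{-1/\gamma}, \qquad y \in \mathbb{E}_\gamma. \tag{2.5}$$

If $\gamma = 0$, we interpret the limit as $e^{-y}$.

If (2.5) holds for some functions $a$ and $b$, then it holds for ([7, Theorem 1.1.6, p. 10])

$$b(t) = \left(\frac{1}{1 - F_Y}\right)^{\leftarrow}(t), \tag{2.6}$$

where $g^{\leftarrow}$ is the left-continuous inverse of the nondecreasing function $g$. By inversion, (2.5) yields

$$\frac{b(tx) - b(t)}{a(t)} \to \frac{x^{\gamma} - 1}{\gamma}\,1_{\{\gamma \neq 0\}} + \log x\;1_{\{\gamma = 0\}}, \tag{2.7}$$

i.e., $a, b \in ERV_{\gamma,1}$. Furthermore, if functions $\tilde a > 0$ and $\tilde b \in \mathbb{R}$ on $(0,\infty)$ are asymptotically equivalent to $a, b$, i.e., they satisfy

$$\frac{\tilde a(t)}{a(t)} \to 1 \quad\text{and}\quad \frac{\tilde b(t) - b(t)}{a(t)} \to 0 \qquad\text{as } t \to \infty,$$

then (2.5) and (2.7) hold with $a, b$ replaced by $\tilde a, \tilde b$. It follows that (2.5) is equivalent to $t\,P[b^{\leftarrow}(Y) > ty] \to y^{-1}$ for $y > 0$, i.e., $1 - F_{b^{\leftarrow}(Y)} \in RV_{-1}$. This is known as standardization (see [16, Chapter 5]). We say that $Y^*$ is in the standardized domain of attraction, and write $F_{Y^*} \in D(G_1)$, if

$$t\,P[Y^* > ty] \to y^{-1}, \qquad y > 0, \tag{2.8}$$

a variant of (2.5) for $\gamma = 1$.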

2.3. The Conditional Extreme Value (CEV) Model. Denote by $\overline{\mathbb{E}}_\gamma$ the closure on the right of the interval $\mathbb{E}_\gamma$. A bivariate random vector $(X, Y)$ on $\mathbb{R}^2$ follows a conditional extreme value model (CEVM) if there exist a measure $\mu \in M_+([-\infty,\infty] \times \overline{\mathbb{E}}_\gamma)$ and functions $a(t), \alpha(t) > 0$, $b(t), \beta(t) \in \mathbb{R}$, such that, as $t \to \infty$,

$$t\,P\left[\left(\frac{X - \beta(t)}{\alpha(t)}, \frac{Y - b(t)}{a(t)}\right) \in \cdot\,\right] \stackrel{v}{\to} \mu(\cdot) \qquad\text{in } M_+([-\infty,\infty] \times \overline{\mathbb{E}}_\gamma), \tag{2.9}$$

and where $\mu$ satisfies the conditional non-degeneracy conditions: for each $y \in \mathbb{E}_\gamma$,

$$\mu([-\infty, x] \times (y, \infty]) \text{ is not a degenerate distribution in } x; \qquad \mu(\{\infty\} \times (y, \infty]) = 0. \tag{2.10}$$

It is convenient to choose the normalization such that

$$H(x) := \mu([-\infty, x] \times (0, \infty]) \text{ is a probability distribution on } [-\infty, \infty]. \tag{2.11}$$

See [5, 10] for details and [11] for background.

Some remarks: By applying the joint convergence (2.9) to rectangles $[-\infty,\infty] \times (y,\infty]$, we see that the distribution of $Y$ is necessarily attracted to $G_\gamma$ for some $\gamma$. Also, an important property is that the functions $\alpha, \beta$ are ERV for some $\rho, k \in \mathbb{R}$ [10, Proposition 1]. The limit measure in (2.9) is a product measure if and only if $(\rho, k) = (0, 0)$ [10, Proposition 2].

Condition (2.10) is somewhat different from what is given in [5, 10], which failed to preclude mass on the line $\{\infty\} \times (-\infty, \infty]$ through infinity. Mass on this line invalidates the convergence to types theorem [8, 14], and since the theory in [5, 10] employs convergence of types arguments, we require the second condition in (2.10). Condition (2.9) entails $F_Y \in D(G_\gamma)$ and $\mu([-\infty, x] \times \{\infty\}) = 0$. Example 3.6 is a case where (2.9) holds for two distinct normalizations, which are not asymptotically equivalent, yielding two distinct limit measures. One limit measure has $\mu(\{\infty\} \times (y, \infty]) > 0$ and the other has $\mu(\{\infty\} \times (y, \infty]) = 0$.

3. Standard Case 

Let $(X, Y)$ be a random vector on $\mathbb{R}^2$, with dependence specified by a transition kernel $K$:

$$P[X \in A \mid Y = y] = K(y, A), \qquad y \in \mathbb{R}.$$

$K(y, A)$ is a measure in the second variable $A$ and measurable in $y$ for each fixed $A$. We show that if the distribution of $Y$ is in an extremal domain of attraction, and $K$ belongs to the domain of attraction of a probability distribution $G$ (a notion to be defined precisely), then $(X, Y)$ follows a CEVM. We begin with the standard case, which means that $(X, Y) \in [0, \infty)^2$ and $F_Y \in D(G_1)$,

$$t\,F_Y(t\,\cdot) \stackrel{v}{\to} \nu_1(\cdot) \qquad\text{in } M_+(0, \infty] \text{ as } t \to \infty, \tag{3.1}$$

where $\nu_1(x, \infty] = x^{-1}$, $x > 0$ (a formulation equivalent to (2.8)), and $K \in D(G)$, meaning

$$K(t, t\,\cdot) \Rightarrow G(\cdot) \qquad\text{on } [0, \infty]. \tag{3.2}$$

In what follows, $\xi$ will always be a random variable with distribution $G$.

3.1. Standard CEVM Properties. Conditions (3.1) and (3.2) imply that $(X, Y)$ follows a CEVM, provided $G \neq \epsilon_0$, i.e., unit mass at $\{0\}$.

Theorem 3.1. Suppose that the joint distribution of the random vector $(X, Y)$ on $[0, \infty)^2$ satisfies (3.1) and (3.2), where $G$ is a probability distribution on $[0, \infty)$. Then

$$t\,P[(X, Y) \in t\,\cdot\,] \stackrel{v}{\to} \mu(\cdot) \qquad\text{in } M_+([0, \infty] \times (0, \infty]), \tag{3.3}$$

with limit measure $\mu$ given for $x, y > 0$, $\xi \sim G$ by

$$\mu([0, x] \times (y, \infty]) = \int_y^\infty G(x/u)\,\nu_1(du) = \frac{1}{x}\int_0^{x/y} G(u)\,du = y^{-1} P\left[\xi \le \frac{x}{y}\right] - x^{-1} E\left[\xi\,1_{\{\xi \le x/y\}}\right]. \tag{3.4}$$

Furthermore, $\mu$ satisfies the conditional non-degeneracy conditions (2.10) provided $G \neq \epsilon_0$.

Proof. The convergence (3.3) is a special case of Proposition 5.1 of [18]; it is an elaboration of the continuous mapping theorem. From (3.4), $\mu([0, x] \times (y, \infty])$ is continuous in $x$, and not constant provided $G \neq \epsilon_0$. Also, since $\mu((x, \infty] \times (y, \infty]) = \int_y^\infty \nu_1(du)\,P[\xi > x u^{-1}]$, that $\mu(\{\infty\} \times (y, \infty]) = 0$ follows from the fact that $G(\{\infty\}) = 0$. Therefore, $\mu$ satisfies (2.10). □
3.1.1. Properties of the limit measure $\mu$. From (3.4) we see that $\mu$ is continuous in $x$ and $y$, and if $G$ has a density, then so does $\mu$. Continuity in (3.4) holds even if $G$ is degenerate, i.e., $G = \epsilon_c$ for some $c > 0$; see Example 3.4. Non-degeneracy of $G$ only becomes relevant in the non-standard case. Moreover, $\mu$ cannot be a product measure [5, Lemma 3.1].

From (3.4) we also observe that the $y$-axis through the origin is assigned mass proportional to $G(\{0\})$, since $\mu(\{0\} \times (y, \infty]) = y^{-1} G(\{0\})$. Mass on vertical slices of space depends on $E\xi$, since $\mu((x, \infty] \times (0, \infty]) = x^{-1} E\xi \le \infty$. In terms of conditional distributions, (3.3) implies

$$P[X \le tx \mid Y > t] \to H(x) := \mu([0, x] \times (1, \infty]) = \frac{1}{x}\int_0^x G(u)\,du. \tag{3.5}$$

3.1.2. Extending to a larger cone. Convergence (3.3) extends to standard regular variation on the larger cone $[0, \infty]^2 \setminus \{0\}$, so that the distribution of $(X, Y)$ is in a bivariate domain of attraction, if and only if $F_X \in D(G_1)$ as well [5, Proposition 4.1]. In this case,

$$t\,P\left[t^{-1}(X, Y) \in [0, (x, y)]^c\right] \to \frac{1}{x}\left(1 + \int_0^{x/y} G(u)\,du\right), \tag{3.6}$$

implying that $E\xi \le 1$, and the $x$-axis receives mass according to $\mu((x, \infty] \times \{0\}) = x^{-1}(1 - E\xi)$.

3.1.3. Degenerate $G$; asymptotic independence. If $G = \epsilon_0$, then the convergence (3.3) holds with limit measure $\mu([0, x] \times (y, \infty]) = y^{-1}$, but conditional non-degeneracy (2.10) fails, since all the mass lies on the $y$-axis, so $(X, Y)$ does not follow a standard CEVM. This is in fact a manifestation of asymptotic independence. Indeed,

$$P[X > tx \mid Y > t] \to 0$$

for any $x > 0$, so, given that $Y$ is extreme (exceeding the threshold $u(t) = t$), it is very unlikely to observe $X$ to be similarly extreme. If the joint distribution of $(X, Y)$ is regularly varying on the larger cone $[0, \infty]^2 \setminus \{0\}$, then

$$t\,P\left[t^{-1}(X, Y) \in [0, (x, y)]^c\right] \to x^{-1} + y^{-1},$$

which means that $X$ and $Y$ are asymptotically independent in the usual sense [10, Section 5]. In this case, $(X, Y)$ does not follow a standard CEVM because of degeneracy, although a CEVM may hold if $X$ is normalized differently; see Section 4.

This suggests viewing the parameter $G(\{0\})$ as a measure of asymptotic dependence from $Y$ to $X$. For example, given $Y$, we could write $X$ as a mixture

$$X = W X_0 + (1 - W) X_1, \tag{3.7}$$

where $X_0$ and $Y$ are asymptotically independent, $X_1$ and $Y$ are asymptotically dependent, and $W \sim \mathrm{Bernoulli}(G(\{0\}))$. This is suggested by the canonical form of the update function representation of $K \in D(G)$ [18, Section 2.3]. Asymptotic dependence in the reverse direction, given large $X$, would then be quantified by $1 - E\xi$ if appropriate. The latter phenomenon is hinted at by Segers [20] in his definition of the "back-and-forth tail chain" to approximate stationary Markov chains.

3.2. Examples. Examples illuminate properties of the CEVM based on Markov kernels as in (3.2). First, as in [5, Example 8], given any distribution $G$ on $[0, \infty)$, we construct a CEVM whose limit measure $\mu$ is built on $G$ as in (3.4).

Example 3.1. Take $G$ to be any probability distribution on $[0, \infty)$ not concentrating at 0. Let $Y \sim \mathrm{Pareto}(1)$ on $[1, \infty)$, $\xi \sim G$, independent of $Y$, and put $X = \xi Y$. A version of the conditional distribution is

$$K(y, \cdot) = P[X \in \cdot \mid Y = y] = P[\xi Y \in \cdot \mid Y = y] = G(y^{-1}\,\cdot),$$

and $K$ satisfies (3.2); in fact, $K(t, t\,\cdot) = G(\cdot)$. Consequently, $(X, Y)$ follows a standard CEVM with limit measure as in (3.4). In fact, for $x, y > 0$, we have

$$P[X \le x, Y > y] = \int_{(y, \infty]} K(u, [0, x])\,P[Y \in du] = \int_{y \vee 1}^\infty P[\xi \le x u^{-1}]\,u^{-2}\,du = \frac{1}{x}\int_0^{x/(y \vee 1)} G(u)\,du.$$

Furthermore, $(X, Y)$ belongs to a standard bivariate domain of attraction (3.6) iff $F_X \in D(G_1)$ as well. The marginal distribution of $X = \xi Y$ is

$$F_X(x) = \frac{1}{x}\int_0^x G(u)\,du = H(x)$$

(from (3.5)), which has density $f_X(x) = x^{-1}(G(x) - H(x))$ for $x > 0$. Since

$$\lim_{t\to\infty} t\,P[X > tx] = \lim_{t\to\infty} \frac{1}{x}\int_0^{tx} P[\xi > u]\,du = x^{-1} E\xi \quad (\le \infty),$$

$(X, Y)$ belongs to the standard domain of attraction iff $E\xi = 1$. □
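The construction in Example 3.1 is easy to check by simulation. The sketch below is only illustrative: it assumes $\xi \sim \mathrm{Exp}(1)$ (our choice, not required by the example), uses the last expression in (3.4) for the limit measure, and the function names, sample size and seed are hypothetical. Since $Y$ is exactly Pareto, $t\,P[X \le tx, Y > ty]$ coincides with the limit as soon as $ty \ge 1$, so the only discrepancy is Monte Carlo error.

```python
import math
import random

def mu_exp1(x, y):
    """(3.4) for xi ~ Exp(1): P[xi <= r]/y - E[xi; xi <= r]/x with r = x/y."""
    r = x / y
    prob = 1.0 - math.exp(-r)
    partial_mean = 1.0 - math.exp(-r) * (1.0 + r)
    return prob / y - partial_mean / x

def simulate(t, x, y, n=200_000, seed=12345):
    """Monte Carlo estimate of t * P[X <= t x, Y > t y] with X = xi * Y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        big_y = 1.0 / (1.0 - rng.random())   # Pareto(1) on [1, inf) by inversion
        xi = -math.log(1.0 - rng.random())   # Exp(1) by inversion
        if big_y > t * y and xi * big_y <= t * x:
            hits += 1
    return t * hits / n

estimate = simulate(t=10.0, x=1.0, y=0.5)
exact = mu_exp1(1.0, 0.5)   # equals 1 + exp(-2)
```

With these settings the estimate should agree with `exact` up to a Monte Carlo error of order $10^{-2}$.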

Using the Example 3.1 recipe, we explore the CEVM in a variety of special cases. 

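Example 3.2. Choose $\xi \sim \mathrm{Exp}(\lambda)$, so that $X = \lambda^{-1} Y E$, where $E \sim \mathrm{Exp}(1)$. The limit measure is

$$\mu([0, x] \times (y, \infty]) = \frac{1}{x}\int_0^{x/y} (1 - e^{-\lambda u})\,du = \frac{1}{y} - \frac{1}{\lambda x} + \frac{e^{-\lambda x/y}}{\lambda x},$$

and the marginal distribution of $X$ is $F_X(x) = 1 - (\lambda x)^{-1}(1 - e^{-\lambda x})$ with density $f(x) = \lambda^{-1} x^{-2}(1 - e^{-\lambda x}) - x^{-1} e^{-\lambda x}$, and $F_X$ satisfies (3.1) iff $\lambda = 1$. □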
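The closed form in Example 3.2 can be compared with a direct quadrature of the middle expression in (3.4). The following sketch does this with a plain midpoint rule; the helper names, the value $\lambda = 2$ and the grid size are our own illustrative choices.

```python
import math

def mu_from_G(G, x, y, n=20_000):
    """Midpoint-rule value of (1/x) * int_0^{x/y} G(u) du, the middle form of (3.4)."""
    h = (x / y) / n
    return sum(G((i + 0.5) * h) for i in range(n)) * h / x

def mu_exponential(x, y, lam):
    """Closed form of Example 3.2 for xi ~ Exp(lam)."""
    return 1.0 / y - 1.0 / (lam * x) + math.exp(-lam * x / y) / (lam * x)

lam = 2.0
G_exp = lambda u: 1.0 - math.exp(-lam * u)
numeric = mu_from_G(G_exp, x=3.0, y=1.5)
closed = mu_exponential(3.0, 1.5, lam)
```

Setting $y = 1$ in `mu_exponential` recovers the marginal distribution function $F_X$ stated in the example.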

Next, we suppose ^ is heavy-tailed. 

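Example 3.3. For $\alpha > 0$ let $\xi \sim \mathrm{Pareto}(\alpha)$, so $1 - G(x) =: \bar G(x) = x^{-\alpha}$, $x \ge 1$. The limit measure assigns no mass to $\{(x, y) : 0 \le x < y\}$, and for $x \ge y > 0$, evaluating $\frac{1}{x}\int_1^{x/y}(1 - u^{-\alpha})\,du$ from (3.4) gives

$$\mu([0, x] \times (y, \infty]) = \begin{cases} \dfrac{1}{y} - \left(\dfrac{\alpha}{\alpha - 1}\right)\dfrac{1}{x} + \dfrac{y^{\alpha - 1}}{x^{\alpha}(\alpha - 1)}, & \alpha > 1, \\[2ex] \dfrac{1}{y} - \dfrac{1}{x} - \dfrac{\log x}{x} + \dfrac{\log y}{x}, & \alpha = 1, \\[2ex] \dfrac{1}{y} + \left(\dfrac{\alpha}{1 - \alpha}\right)\dfrac{1}{x} - \dfrac{1}{x^{\alpha} y^{1 - \alpha}(1 - \alpha)}, & \alpha < 1. \end{cases}$$

When $\alpha \le 1$, $E\xi = \infty$ and $\mu((x, \infty] \times (y, \infty]) = y^{-1} - \mu([0, x] \times (y, \infty]) \to \infty$ as $y \downarrow 0$. □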
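As a sanity check on the Pareto case, the following sketch compares a midpoint-rule evaluation of $\frac{1}{x}\int_1^{x/y}(1 - u^{-\alpha})\,du$, obtained from (3.4), with the closed forms one gets by integrating directly. The function names and grid size are ours, and the closed forms are our own evaluation of that integral (for $\alpha \ne 1$ the two branches collapse to a single algebraic expression).

```python
import math

def mu_pareto_numeric(x, y, alpha, n=20_000):
    """Midpoint rule for (1/x) * int_1^{x/y} (1 - u**(-alpha)) du; zero if x <= y."""
    upper = x / y
    if upper <= 1.0:
        return 0.0
    h = (upper - 1.0) / n
    total = sum(1.0 - (1.0 + (i + 0.5) * h) ** (-alpha) for i in range(n))
    return total * h / x

def mu_pareto_closed(x, y, alpha):
    """Direct evaluation of the same integral for x >= y > 0."""
    if x < y:
        return 0.0
    if alpha == 1.0:
        return 1.0 / y - 1.0 / x - math.log(x / y) / x
    return (1.0 / y - alpha / ((alpha - 1.0) * x)
            + y ** (alpha - 1.0) / ((alpha - 1.0) * x ** alpha))
```

Both functions vanish on the diagonal $x = y$, in line with the statement that no mass sits on $\{x < y\}$.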

It is also possible that $G$ is discrete, although the CEVM limit measure $\mu$ remains continuous.

Example 3.4. Suppose $\xi$ has discrete distribution $P[\xi = k] = a_k$, $k = 0, 1, \ldots$. In this case, the limit measure is given by

$$\mu([0, x] \times (y, \infty]) = \frac{1}{x}\int_0^{x/y}\Big(\sum_{k \le u} a_k\Big)\,du = \sum_{k=0}^{\lfloor x/y \rfloor} a_k\,(y^{-1} - k x^{-1}),$$

which is continuous in $x$ and $y$, and $F_X(x) = \sum_{k=0}^{\lfloor x \rfloor} a_k (1 - k x^{-1})$. In particular, if $P[\xi = c] = 1$ for some $c > 0$, we obtain

$$\mu([0, x] \times (y, \infty]) = (y^{-1} - c x^{-1})\,1_{\{x \ge cy > 0\}}.$$

The conditional non-degeneracy conditions (2.10) are satisfied even though $G$ is degenerate. □

The final example shows how $G$ reflects asymptotic independence between $X$ and $Y$.

Example 3.5. Consider $Y \sim \mathrm{Pareto}(1)$, and $Z$ independent of $Y$ such that $P[Z < \infty] = 1$. Take $X = Y \vee Z$. Given that $Y$ is extreme, it is unlikely that $Z$ is as extreme as $Y$, since they are independent. We have

$$K(y, [0, x]) = P[Y \le x, Z \le x \mid Y = y] = P[Z \le x]\,1_{\{x \ge y\}},$$

and so

$$K(t, t[0, x]) = P[Z \le tx]\,1_{\{x \ge 1\}} \to 1_{\{x \ge 1\}} = \epsilon_1([0, x]) = G([0, x]).$$

As in the previous example, the limit measure is

$$\mu([0, x] \times (y, \infty]) = (y^{-1} - x^{-1})\,1_{\{x \ge y > 0\}}.$$

On the other hand, if $X' = Y \wedge Z$, then when $Y$ is large, it is likely that $X' = Z$, so $X'$ should be asymptotically independent of $Y$. More precisely,

$$K(y, (x, \infty]) = P[Y > x, Z > x \mid Y = y] = P[Z > x]\,1_{\{y > x\}},$$

from which

$$K(t, t(x, \infty]) = P[Z > tx]\,1_{\{x < 1\}} \to 0$$

for $x > 0$. Therefore, $G = \epsilon_0$, and the conditional non-degeneracy conditions fail. □

3.3. Counter-examples. As expected, the converse to Theorem 3.1 can fail. If $(X, Y)$ follows a non-degenerate CEVM as in (3.3), and $K$ is a specific version of the conditional distribution $P[X \in \cdot \mid Y = y]$, it does not necessarily follow that there exists a distribution $G$ such that (3.2) holds. The failure of (3.2) can happen in two ways. There may exist a probability distribution $G$ on $[0, \infty]$ satisfying (3.2) with $G(\{\infty\}) > 0$, or it may be possible to obtain two distinct limit distributions along different subsequences.

Example 3.6, where $G(\{\infty\}) > 0$, emphasizes the importance of assuming $\mu(\{\infty\} \times (y, \infty]) = 0$.

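Example 3.6. As usual, take $Y \sim \mathrm{Pareto}(1)$ and suppose that

$$X = WY + (1 - W)Y^2,$$

where $W \sim \mathrm{Bernoulli}(p)$ is independent of $Y$. Then

$$K(y, \cdot) = P[X \in \cdot \mid Y = y] = p\,\epsilon_y + (1 - p)\,\epsilon_{y^2},$$

so

$$K(t, t\,\cdot) = p\,\epsilon_1 + (1 - p)\,\epsilon_t \Rightarrow p\,\epsilon_1 + (1 - p)\,\epsilon_\infty = G \qquad\text{on } [0, \infty].$$

Indeed, for $0 < x < \infty$,

$$K(t, t[0, x]) = p\,\epsilon_1([0, x]) + (1 - p)\,\epsilon_t([0, x]) \to p\,\epsilon_1([0, x]),$$

showing that $G(\{\infty\}) = 1 - p$.

On the other hand, for $x, y > 0$,

$$P[X \le x, Y > y] = p\,P[Y \le x, Y > y] + (1 - p)\,P[Y^2 \le x, Y > y] = p\left(\frac{1}{y \vee 1} - \frac{1}{x}\right)1_{\{x > (y \vee 1)\}} + (1 - p)\left(\frac{1}{y \vee 1} - \frac{1}{\sqrt{x}}\right)1_{\{x > (y \vee 1)^2\}},$$

so for $t$ sufficiently large,

$$t\,P[X \le tx, Y > ty] = p\left(\frac{1}{y} - \frac{1}{x}\right)1_{\{x > y\}} + (1 - p)\left(\frac{1}{y} - \frac{\sqrt{t}}{\sqrt{x}}\right)1_{\{x > t y^2\}} \to p\left(\frac{1}{y} - \frac{1}{x}\right)1_{\{x > y\}} = \mu([0, x] \times (y, \infty]). \tag{3.8}$$

The measure $\mu$ assigns positive mass to $\{\infty\} \times (y, \infty]$, since

$$\mu((x, \infty] \times (y, \infty]) = y^{-1} - \mu([0, x] \times (y, \infty]) = \frac{1}{y}\,1_{\{x \le y\}} + \left(\frac{1 - p}{y} + \frac{p}{x}\right)1_{\{x > y\}},$$

and thus $\mu(\{\infty\} \times (y, \infty]) = (1 - p)\,y^{-1}$. Therefore, $\mu$ does not satisfy (2.10).

Under a different normalization, we obtain a proper limit $G$. Indeed, note that

$$K(t, t^2\,\cdot) = p\,\epsilon_{t^{-1}} + (1 - p)\,\epsilon_1 \Rightarrow p\,\epsilon_0 + (1 - p)\,\epsilon_1 \sim \mathrm{Bernoulli}(1 - p),$$

and hence,

$$t\,P[X \le t^2 x, Y > ty] = p\,(y^{-1} - (tx)^{-1})\,1_{\{x > y/t\}} + (1 - p)(y^{-1} - x^{-1/2})\,1_{\{x > y^2\}} \to p\,y^{-1} + (1 - p)(y^{-1} - x^{-1/2})\,1_{\{x > y^2\}}.$$

This limit does satisfy (2.10). □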
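The two normalizations in Example 3.6 can be compared numerically, because the joint probability is available in closed form for finite $t$. The sketch below is illustrative only; the function names and the parameter values ($p = 0.3$, $t = 10^6$) are our own.

```python
def pareto_tail(v):
    """P[Y > v] for Y ~ Pareto(1) on [1, inf)."""
    return 1.0 if v < 1.0 else 1.0 / v

def joint(x, y, p):
    """P[X <= x, Y > y] for X = W*Y + (1-W)*Y**2, W ~ Bernoulli(p) independent of Y."""
    linear = max(0.0, pareto_tail(y) - pareto_tail(x))          # P[y < Y <= x]
    square = max(0.0, pareto_tail(y) - pareto_tail(x ** 0.5))   # P[y < Y <= sqrt(x)]
    return p * linear + (1.0 - p) * square

p, t, x, y = 0.3, 1e6, 4.0, 1.0
standard = t * joint(t * x, t * y, p)        # X scaled by t
squared = t * joint(t * t * x, t * y, p)     # X scaled by t**2

limit_standard = p * (1.0 / y - 1.0 / x) if x > y else 0.0
limit_squared = p / y + ((1.0 - p) * (1.0 / y - x ** -0.5) if x > y * y else 0.0)
```

Under the scaling by $t$, letting $x \to \infty$ in the limit only recovers $p/y$ of the total mass $1/y$, which is the escape of mass $(1 - p)/y$ to $\{\infty\} \times (y, \infty]$ described above; under the scaling by $t^2$ the limit tends to $1/y$.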

Without condition (2.10), the convergence of types theorem fails and it is possible to obtain different CEV limits under different normalizations. From (3.4), $\mu(\{\infty\} \times (y, \infty]) = G(\{\infty\})\,y^{-1}$, and excluding defective distributions in Theorem 3.1 avoids cases like the previous one.

Here is an example of a CEVM where the normalized kernel K does not have a unique limit. 

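Example 3.7. Suppose $Y \sim \mathrm{Pareto}(1)$, and define $X$ by

$$X = WY + (1 - W)\,2Y\,1_{\{Y \in [0, \infty) \setminus \mathbb{N}\}},$$

where $W \sim \mathrm{Bernoulli}(p)$ is independent of $Y$. In other words, given $Y = y$, $X$ takes the value $y$ or $2y$ according to a coin flip, unless $y$ is an integer, in which case $X$ will be either $y$ or 0. The CEVM holds for $(X, Y)$. Since $P[Y \in \mathbb{N}] = 0$, we have

$$P[X \le x, Y > y] = P[X \le x, Y > y, Y \in [0, \infty) \setminus \mathbb{N}] = p\,P(Y \le x, Y > y) + (1 - p)\,P(2Y \le x, Y > y),$$

and $t\,P[X \le tx, Y > ty] = P[X \le x, Y > y]$ for $y \ge 1$, which satisfies (2.11) and the requirement that $\mu((\cdot) \times (y, \infty])$ not be degenerate for any $y$. However, the conditional distribution of $X$ given $Y$ is

$$K(y, \cdot) = \begin{cases} p\,\epsilon_y + (1 - p)\,\epsilon_0, & y \in \mathbb{N}, \\ p\,\epsilon_y + (1 - p)\,\epsilon_{2y}, & y \in [0, \infty) \setminus \mathbb{N}, \end{cases}$$

so

$$K(t, t\,\cdot) = \begin{cases} p\,\epsilon_1 + (1 - p)\,\epsilon_0, & t \in \mathbb{N}, \\ p\,\epsilon_1 + (1 - p)\,\epsilon_2, & t \in [0, \infty) \setminus \mathbb{N}. \end{cases}$$

We obtain different limits along the integer sequence $t_n = n$ and along any sequence of non-integers such as $t'_n = n + 1/2$, so $K(t, t\,\cdot)$ does not converge. □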

The technical difficulty highlighted in Example 3.7 is that conditional distributions of the form $P[X \in \cdot \mid Y = y]$ are only specified up to sets of $P[Y \in \cdot\,]$-measure zero. If $Y$ is absolutely continuous, we can alter the conditional probability for a countable number of $y$ without affecting the joint distribution. Consequently, constructing a convergence theory based on conditional distributions requires care. The best one can do is fix a version of the kernel or, if circumstances allow, choose a version of the kernel with some claim to naturalness based on smoothness. This is the reason the approach in [5, 10] is based on vague convergence of measures rather than convergence of conditional distributions as in [11].



4. General Normalization for X

The CEVM allows different normalizations for $X$ and $Y$, as in (2.9), but the formulation $K \in D(G)$ in (3.2) imposes the same normalization for both. We now allow general linear normalizations of $X$ in the kernel condition, continuing to assume condition (3.1) that $F_Y \in D(G_1)$.

We will assume the following generalization of (3.2): there exist scaling and centering functions $\alpha(t) > 0$, $\beta(t) \in \mathbb{R}$ and a non-degenerate probability distribution $G$ on $[-\infty, \infty)$, such that

$$K(t, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow G([-\infty, x]) \qquad\text{on } [-\infty, \infty]. \tag{4.1}$$

4.1. CEVM Properties. Consider the decomposition

$$t\,P\left[\frac{X - \beta(t)}{\alpha(t)} \le x, Y > ty\right] = \int_{(y, \infty]} t\,P[Y \in t\,du]\;K(tu, [-\infty, \alpha(t)x + \beta(t)]). \tag{4.2}$$

By a variant of the continuous mapping theorem ([18, Lemma 8.2]), the integrals converge provided $K(t u(t), [-\infty, \alpha(t)x + \beta(t)]) \to \varphi_x(u)$ whenever $u(t) \to u > 0$. Proposition 4.1 discusses when this happens.

Given $\rho, k \in \mathbb{R}$, define the generalized tail kernel associated with a distribution $G$ on $[-\infty, \infty]$ as the transition function $\kappa_G : (0, \infty) \times \mathcal{B}[-\infty, \infty] \to [0, 1]$ given by

$$\kappa_G(y, A) = G\left(y^{-\rho}[A - \psi(y)]\right), \tag{4.3}$$

where $\psi$ is specified in (2.2). Note that $\kappa_G$ describes transitions between two different spaces. Since $\psi$ satisfies $\psi(uy) = u^{\rho}\psi(y) + \psi(u)$, a kernel $\kappa$ has the form (4.3) iff

$$\kappa(uy, A) = \kappa\left(y, u^{-\rho}[A - \psi(u)]\right). \tag{4.4}$$
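The scaling relation (4.4) is straightforward to verify numerically on sets of the form $A = [-\infty, x]$. The sketch below assumes a standard logistic distribution for $G$ (purely our choice for illustration); the function names are also ours.

```python
import math

def psi(x, rho, k):
    """Limit function psi from (2.2)."""
    if rho != 0:
        return k * (x ** rho - 1.0) / rho
    return k * math.log(x)

def logistic_cdf(z):
    """Illustrative choice of G; any distribution function would do."""
    return 1.0 / (1.0 + math.exp(-z))

def tail_kernel(y, x, rho, k, G=logistic_cdf):
    """kappa_G(y, [-inf, x]) = G(y**(-rho) * (x - psi(y))), following (4.3)."""
    return G(y ** (-rho) * (x - psi(y, rho, k)))

def scaling_gap(u, y, x, rho, k):
    """Left side minus right side of (4.4) on the set [-inf, x]."""
    lhs = tail_kernel(u * y, x, rho, k)
    rhs = tail_kernel(y, u ** (-rho) * (x - psi(u, rho, k)), rho, k)
    return lhs - rhs
```

The gap is at rounding level for any $u, y > 0$, reflecting the functional equation $\psi(uy) = u^{\rho}\psi(y) + \psi(u)$.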

Proposition 4.1. Let $K : (0, \infty) \times \mathcal{B}[-\infty, \infty] \to [0, 1]$ be a transition function satisfying (4.1) with $G$ non-degenerate. There exists a family of non-degenerate probability distributions $\{G_u : 0 < u < \infty\}$ on $[-\infty, \infty)$ such that for $0 < u < \infty$,

$$K(tu, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow G_u([-\infty, x]) \qquad\text{on } [-\infty, \infty], \tag{4.5}$$

as $t \to \infty$ if and only if $\alpha, \beta \in ERV_{\rho,k}$ as in (2.1). In this case, $G_1 = G$, and

$$K(t u_t, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow \kappa_G(u, [-\infty, x]) \qquad\text{on } [-\infty, \infty] \tag{4.6}$$

whenever $u_t = u(t) \to u \in (0, \infty)$; i.e., the limit is a transition function of the form (4.3), where $\rho, k$ are the ERV parameters of $\alpha, \beta$.

Proof. Assume first that $\alpha, \beta \in ERV_{\rho,k}$ and define

$$h_t(y; u) = \frac{\alpha(tu)}{\alpha(t)}\,y + \frac{\beta(tu) - \beta(t)}{\alpha(t)},$$

so that by (2.1), $h_t(y_t; u) \to h(y; u) = u^{\rho} y + \psi(u)$ whenever $y_t \to y \in \mathbb{R}$. For $u > 0$,

$$K(tu, [-\infty, \alpha(t)x + \beta(t)]) = K\left(tu, \alpha(tu)\{h_t^{-1}(\cdot\,; u)[-\infty, x]\} + \beta(tu)\right).$$

Applying the second continuous mapping theorem ([1], [18, Lemma 8.1]) to (4.1), we have

$$K(tu, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow (G \circ h^{-1}(\cdot\,; u))([-\infty, x]) = G([-\infty, (x - \psi(u))/u^{\rho}]).$$

Hence, (4.5) holds with $G_u(\cdot) = \kappa_G(u, \cdot)$ and $G_1 = G$. Furthermore, we have $h_t(x_t; u_t) \to h(x; u)$ whenever $x_t \to x$ and $u_t \to u > 0$, establishing (4.6).

For the converse, we employ convergence of types. Denote by $H_t(\cdot)$ the distribution $K(t, \cdot)$. Then, on the one hand, we have $H_t([-\infty, \alpha(t)x + \beta(t)]) \Rightarrow G_1([-\infty, x])$. On the other hand, fixing $c > 0$, we have

$$H_t([-\infty, \alpha(tc)x + \beta(tc)]) = K\left((tc)c^{-1}, [-\infty, \alpha(tc)x + \beta(tc)]\right) \Rightarrow G_{c^{-1}}([-\infty, x]).$$

Convergence of types yields that $\alpha, \beta \in ERV_{\rho,k}$, and

$$G_{c^{-1}}([-\infty, x]) = G_1([-\infty, c^{\rho} x + \psi(c)]),$$

with $\psi$ as in (2.2). Using the identity (2.3), we find that $G_u$ has the form (4.3), with $G = G_1$. □

Starting from kernel convergence (4.1), Proposition 4.1 implies that $\alpha, \beta$ being ERV is necessary and sufficient for obtaining a CEVM. Unlike Section 3, here we need $G$ to be non-degenerate in order to apply the convergence of types theorem.

Theorem 4.1. Suppose $(X, Y)$ is a random vector on $\mathbb{R} \times [0, \infty)$ and (3.1) holds. Assume (4.1) holds for a non-degenerate limit distribution $G$ on $[-\infty, \infty)$ and scaling and centering functions $\alpha(t) > 0$ and $\beta(t) \in \mathbb{R}$. As $t \to \infty$,

$$t\,P\left[\left(\frac{X - \beta(t)}{\alpha(t)}, \frac{Y}{t}\right) \in \cdot\,\right] \stackrel{v}{\to} \mu(\cdot) \not\equiv 0 \qquad\text{in } M_+([-\infty, \infty] \times (0, \infty]), \tag{4.7}$$

where $\mu$ satisfies (2.10), if and only if $\alpha, \beta \in ERV_{\rho,k}$. In this case, $\mu$ is specified by

$$\mu([-\infty, x] \times (y, \infty]) = \int_y^\infty \nu_1(du)\,G\left(u^{-\rho}(x - \psi(u))\right), \qquad x \in \mathbb{R},\ y > 0, \tag{4.8}$$

with $\psi$ as in (2.2) and $\nu_1(du) = u^{-2}\,du$, $u > 0$. Expression (4.8) is continuous in $x$ and $y$ if $(\rho, k) \neq (0, 0)$, or if $G$ is continuous.

Proof. The convergence (4.7) to a limit $\mu$ satisfying (2.10) implies $\alpha, \beta \in$ ERV [10, Proposition 1]. Conversely, if $\alpha, \beta \in ERV_{\rho,k}$, then the convergence (4.7) follows from a variant of the continuous mapping theorem ([18, Lemma 8.4]) in light of (4.6), yielding the limit in (4.8). We check that $\mu([-\infty, x] \times (y, \infty])$ is continuous when $(\rho, k) \neq (0, 0)$ by applying dominated convergence: if $x_n \to x$, then

$$G\left(u^{-\rho}(x_n - \psi(u))\right) \to G\left(u^{-\rho}(x - \psi(u))\right)$$

for all except a countable number of $u$ corresponding to discontinuities of the distribution function. Continuity in $y$ is clear. Also, if $(\rho, k) = (0, 0)$, then $\mu([-\infty, x] \times (y, \infty]) = y^{-1} G(x)$, which is continuous if $G$ is. In either case, $\mu([-\infty, x] \times (y, \infty])$ is non-degenerate in $x$ because $G$ is non-degenerate. Finally, $\mu(\{\infty\} \times (y, \infty]) = y^{-1} G(\{\infty\}) = 0$. Therefore, $\mu$ satisfies (2.10). □

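Changing variables $u \mapsto 1/u$ in (4.8) and using the identity (2.3), the limit measure is

$$\mu([-\infty, x] \times (y, \infty]) = \int_0^{1/y} G\left(u^{\rho} x + \psi(u)\right)du, \tag{4.9}$$

where

$$u^{\rho} x + \psi(u) = \begin{cases} u^{\rho}(x + k\rho^{-1}) - k\rho^{-1}, & \rho \neq 0, \\ x + k \log u, & \rho = 0. \end{cases}$$

Changing variables again, we obtain the following expressions for $\mu$, according to $(\rho, k)$:

$$\mu([-\infty, x] \times (y, \infty]) = \begin{cases} \dfrac{1}{|\rho|\,|x + k\rho^{-1}|^{1/\rho}} \displaystyle\int_{I_\rho} u^{(1 - \rho)/\rho}\,G\left(u\,\mathrm{sgn}(x + k\rho^{-1}) - k\rho^{-1}\right)du, & \rho \neq 0, \\[2ex] \dfrac{e^{-x/k}}{|k|} \displaystyle\int_{-\infty}^{x\,\mathrm{sgn}(k) - |k| \log y} e^{u/|k|}\,G(u\,\mathrm{sgn}(k))\,du, & \rho = 0,\ k \neq 0, \\[2ex] y^{-1} G(x), & \rho = 0,\ k = 0, \end{cases} \tag{4.10}$$

where the range of integration is $I_\rho = (0,\ y^{-\rho}|x + k\rho^{-1}|)$ if $\rho > 0$ and $I_\rho = (y^{-\rho}|x + k\rho^{-1}|,\ \infty)$ if $\rho < 0$. Here $\mathrm{sgn}(v) = (v/|v|)\,1_{\{v \neq 0\}}$, and we read the measure as $y^{-1} G(-k\rho^{-1})$ when $x = -k\rho^{-1}$ for the case $\rho \neq 0$. Continuity in $x$ and $y$ when $(\rho, k) \neq (0, 0)$ is apparent from the above expressions.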
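The change of variables leading from (4.9) to the $\rho > 0$ case of (4.10) can be checked numerically. The sketch below assumes a standard logistic $G$ and the substitution $w = u^{\rho}|x + k\rho^{-1}|$; both are our own illustrative choices, and the midpoint rule is deliberately crude, so it is only suitable for moderate values of $\rho$ such as $\rho = 1/2$.

```python
import math

def logistic_cdf(z):
    """Illustrative choice of G."""
    return 1.0 / (1.0 + math.exp(-z))

def midpoint(g, lo, hi, n=20_000):
    """Plain midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

def mu_49(x, y, rho, k, G=logistic_cdf):
    """(4.9) for rho != 0: int_0^{1/y} G(u**rho * (x + k/rho) - k/rho) du."""
    shift = k / rho
    return midpoint(lambda u: G(u ** rho * (x + shift) - shift), 0.0, 1.0 / y)

def mu_410(x, y, rho, k, G=logistic_cdf):
    """rho > 0 case after substituting w = u**rho * |x + k/rho| (requires x != -k/rho)."""
    shift = k / rho
    c = abs(x + shift)
    s = math.copysign(1.0, x + shift)
    integral = midpoint(lambda w: w ** ((1.0 - rho) / rho) * G(s * w - shift),
                        0.0, y ** (-rho) * c)
    return integral / (rho * c ** (1.0 / rho))
```

For example, with $\rho = 1/2$, $k = 1$ and $y = 2$, the two functions agree to quadrature accuracy both for $x = 0.3$ (where $x + k\rho^{-1} > 0$) and for $x = -3$ (where it is negative).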
We give an example where $K$ satisfies (4.1), but (4.7) fails because $\alpha, \beta$ are not ERV.

Example 4.1. Consider $Y \sim \mathrm{Pareto}(1)$ and $U \sim \mathrm{Uniform}(0, 1)$, independent of $Y$. Put $X = U e^{Y}$. Then

$$K(y, [0, x]) = P[X \le x \mid Y = y] = P[U \le x e^{-y}] = x e^{-y} \wedge 1.$$

Polynomial scaling is not strong enough to give an informative limit, since for any $\beta > 0$,

$$K(t, t^{\beta}[0, x]) = x t^{\beta} e^{-t} \wedge 1 \to 0.$$

The appropriate normalization is exponential, $(\alpha(t), \beta(t)) = (e^{t}, 0)$, which is not ERV:

$$K(t, \alpha(t)[0, x]) = x e^{t} e^{-t} \wedge 1 = x \wedge 1 = G(x),$$

and up to asymptotic equivalence, by the convergence to types theorem, this is the only normalization yielding a non-degenerate limit. Since $(\alpha(t), \beta(t))$ is not ERV, Theorem 4.1 claims that $(X, Y)$ cannot follow a CEVM. To verify this ab initio, consider $y > 0$ and $t$ so large that $ty > 1$:

$$t\,P[X \le \alpha(t)x, Y > ty] = t\,P[U e^{Y} \le e^{t} x, Y > ty] = \int_y^\infty u^{-2}\left(x e^{-t(u - 1)} \wedge 1\right)du$$

$$= \int_{(y \vee 1, \infty)} u^{-2}\,du\left\{x e^{-t(u - 1)} \wedge 1\right\} + 1_{\{y < 1\}}\int_{(y, 1]} u^{-2}\,du\left\{x e^{-t(u - 1)} \wedge 1\right\}.$$

The first integral in the previous sum tends to 0 by dominated convergence, since its integrand is bounded by $u^{-2}$ and tends to 0 for each $u > 1$. If $y < 1$, the second integral approaches $\nu_1(y, 1] = y^{-1} - 1$. Therefore, the limit is degenerate in $x$, violating conditional non-degeneracy (2.10). This is not repaired by using ERV normalization, since if $\alpha, \beta$ are ERV, then

$$t\,P[X \le \alpha(t)x + \beta(t), Y > ty] = \int_{(y, \infty)} u^{-2}\,du\;K(tu, [0, \alpha(t)x + \beta(t)]) = \int_{(y, \infty)} u^{-2}\,du\left(e^{-tu}(\alpha(t)x + \beta(t)) \wedge 1\right) \to 0,$$

which follows from the asymptotic properties of ERV functions (see Section 2.1). □
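The degeneracy in Example 4.1 can be seen numerically by evaluating $\int_y^\infty u^{-2}(x e^{-t(u-1)} \wedge 1)\,du$ for a large $t$ and several values of $x$. The sketch below uses a midpoint rule on a truncated range; the function names, the truncation point and the grid size are our own choices, and the integrand is computed on the log scale so that large $t$ does not cause overflow.

```python
import math

def capped(x, t, u):
    """min(x * exp(-t*(u-1)), 1), computed on the log scale to avoid overflow."""
    log_val = math.log(x) - t * (u - 1.0)
    return 1.0 if log_val >= 0.0 else math.exp(log_val)

def prelimit(t, x, y, n=100_000):
    """Midpoint rule for int_y^U u**-2 * min(x*exp(-t(u-1)), 1) du, U = max(y, 1) + 1.
    The neglected tail beyond U is at most x * exp(-t)."""
    upper = max(y, 1.0) + 1.0
    h = (upper - y) / n
    total = 0.0
    for i in range(n):
        u = y + (i + 0.5) * h
        total += capped(x, t, u) / (u * u)
    return total * h

value_large_x = prelimit(500.0, 2.0, 0.5)
value_small_x = prelimit(500.0, 0.1, 0.5)
```

Both values are close to $y^{-1} - 1 = 1$ for $y = 0.5$, regardless of $x$, while for $y > 1$ the integral is negligible; this is the degenerate limit described in the example.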

4.2. Standardization of X. In certain cases, it is possible to standardize the $X$ variable [5, Section 3.2].

4.2.1. Standardization functions. Denote by $x^*$ and $x_*$ the upper and lower endpoints of the distribution of $X$ respectively, i.e.,

$$x^* = \sup\{x : F_X(x) < 1\} \quad\text{and}\quad x_* = \inf\{x : F_X(x) > 0\}.$$

Call $f : (0, \infty) \to (x_*, x^*)$ a standardization function if $f$ is monotone and $\lim_{x\to\infty} f(x) = x^*$ if $f$ is non-decreasing and $\lim_{x\to\infty} f(x) = x_*$ if $f$ is non-increasing. As in [5, Section 3], we standardize with such functions. For the purpose of this section, extend the definition of $f^{\leftarrow} : (x_*, x^*) \to (0, \infty)$ in order to invert right-continuous monotone functions which are either increasing or decreasing. Define

$$f^{\leftarrow}(x) = \begin{cases} \inf\{y : f(y) \ge x\} & \text{if } f \text{ is non-decreasing}, \\ \inf\{y : f(y) \le x\} & \text{if } f \text{ is non-increasing}. \end{cases}$$

Note that $f^{\leftarrow}$ is left-continuous for $f$ non-decreasing and right-continuous for $f$ non-increasing. The main property we shall be using is that

$$\begin{cases} f^{\leftarrow}(x) \le y \iff x \le f(y), & f \text{ non-decreasing}, \\ f^{\leftarrow}(x) \le y \iff x \ge f(y), & f \text{ non-increasing}. \end{cases}$$
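To make the inverse $f^{\leftarrow}$ concrete in the non-decreasing case, here is a minimal bisection sketch. The function name, the bracketing interval and the tolerance are hypothetical; the right-continuous step function `math.floor` is used as a test case because its inverse in the above sense is `math.ceil`, which lets one check the equivalence $f^{\leftarrow}(x) \le y \iff x \le f(y)$ exactly.

```python
import math

def left_inverse(f, x, lo, hi, tol=1e-12):
    """Approximate inf{y in [lo, hi] : f(y) >= x} for non-decreasing f by bisection.
    Assumes f(lo) < x <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) >= x:
            hi = mid      # mid belongs to the set {y : f(y) >= x}
        else:
            lo = mid      # the infimum lies to the right of mid
    return hi

approx = left_inverse(math.floor, 2.5, 0.0, 10.0)   # inf{y : floor(y) >= 2.5}

# Exact check of the equivalence for f = floor, whose inverse is ceil.
pairs = [(x, y) for x in (0.5, 1.0, 2.5) for y in (0.9, 1.0, 2.0, 3.0, 3.7)]
equivalence_holds = all((math.ceil(x) <= y) == (x <= math.floor(y)) for x, y in pairs)
```

The non-increasing case is analogous, with the inequality inside the infimum reversed.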



The distinction between the two cases is a technicality which should not cause confusion in the following discussion. Also, say that a monotone function $f$ has two points of change if there exist $x_1 < x_2 < x_3$ such that $f(x_1) < f(x_2) < f(x_3)$ for $f$ non-decreasing, and with the opposite inequalities in the non-increasing case.

If the pair $(X, Y)$ satisfies (4.7) for some $\alpha > 0$ and $\beta$, then we say $(X, Y)$ can be standardized if there exists a standardization function $f$ and a non-null Radon measure $\mu^*$ such that

$$t\,P\left[t^{-1}(f^{\leftarrow}(X), Y) \in \cdot\,\right] \stackrel{v}{\to} \mu^*(\cdot) \qquad\text{in } M_+([0, \infty] \times (0, \infty]). \tag{4.11}$$

If the limit $\mu$ in (4.7) satisfies the conditional non-degeneracy conditions (2.10), then standardization is possible if and only if $(\rho, k) \neq (0, 0)$; i.e., $\mu$ is not a product.

4.2.2. Characterizing standardization functions. What property does / need to be a standardization 
function satisfying (4.11)? 

Proposition 4.2. Suppose {X,Y) follow a CEVM so that (4.7) holds with n satisfying the condi- 
tional non- degeneracy conditions (2.10). Assume {p,k) ^ (0,0). A function f standardizes {X,Y) 
in the sense of (4.11) where fi* satisfies the conditional non- degeneracy conditions iff 

f(tx) - Bit) , , 

where (p has at least two points of change. In this case, fi and fi* are related by 

/i*([0,2;] X (y,oo]) = /u(A^(x) x (y,oo]), 

where 

{ [— oo, ip{x)] f non- decreasing 
[ip{x),oo] f non-increasing 



(4.13) A^{x 



It follows that $\alpha(\cdot), f(\cdot) \in \mathrm{ERV}$, though not necessarily with the same parameters as $\alpha, \beta$.
However, depending on the case, $f$ can be expressed in terms of either $\beta$ or $\alpha$ ([4, Proposition
2.3.3]).

Proof. Suppose $f$ is non-decreasing. Then for $x, y > 0$, we can write

$$
t\,P\!\left[\frac{f^{\leftarrow}(X)}{t} \le x,\ \frac{Y}{t} > y\right] = t\,P\!\left[\frac{X - \beta(t)}{\alpha(t)} \le \frac{f(tx) - \beta(t)}{\alpha(t)},\ \frac{Y}{t} > y\right]. \tag{4.14}
$$

If $f$ satisfies (4.12), then (4.11) holds with

$$
\mu^*([0,x] \times (y,\infty]) = \mu([-\infty, \varphi(x)] \times (y,\infty])
$$

non-degenerate in $x$. On the other hand, if (4.11) holds, then (4.14) implies (4.12), and $\varphi$ has at
least two points of increase because $\mu^*$ is non-degenerate in $x$. The mass at $\{\infty\}$ condition in (2.10)
follows from the fact that $\lim_{x\to\infty} \varphi(x) = \infty$ if $f$ is non-decreasing (see (4.15) below). The case for
$f$ non-increasing is similar, after reversing the inequality for $X$ on the right-hand side of (4.14). □

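Assuming (4.12) and $\alpha, \beta \in \mathrm{ERV}_{\rho,k}$, write

$$
\frac{f(tx) - \beta(t)}{\alpha(t)} = \frac{\alpha(tx)}{\alpha(t)} \cdot \frac{f(tx) - \beta(tx)}{\alpha(tx)} + \frac{\beta(tx) - \beta(t)}{\alpha(t)},
$$

and with $c = \varphi(1)$, $\varphi$ has the form

$$
\varphi(x) = \begin{cases} c\,x^{\rho} + k\rho^{-1}(x^{\rho} - 1) & \rho \ne 0, \\ c + k \log x & \rho = 0. \end{cases} \tag{4.15}
$$

If $\varphi$ has two points of change, we get the constraint that $c \ne 0$ if $\rho \ne 0$, $k = 0$.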
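The form (4.15) can be illustrated numerically. The sketch below assumes a hypothetical choice $\alpha(t) = t^{\rho}$, $\beta(t) = k\rho^{-1}(t^{\rho}-1)$ and $f(t) = c\,t^{\rho} + \beta(t) + 1$, for which the normalized difference in (4.12) equals $\varphi(x) + t^{-\rho}$; the function names and parameter values are illustrative and do not come from the paper.

```python
import math

def psi(x, rho, k):
    """ERV limit function k (x^rho - 1)/rho, read as k log x when rho = 0."""
    return k * math.log(x) if rho == 0 else k * (x**rho - 1.0) / rho

def phi(x, rho, k, c):
    """Limit function of the form (4.15): c x^rho + psi(x), with c = phi(1)."""
    return c * x**rho + psi(x, rho, k)

def has_two_points_of_change(rho, k, c, grid=(0.5, 1.0, 2.0)):
    """phi of the form (4.15) is either constant or strictly monotone,
    so strict monotonicity on a three-point grid is enough here."""
    v = [phi(x, rho, k, c) for x in grid]
    return v[0] < v[1] < v[2] or v[0] > v[1] > v[2]

# Hypothetical normalizing functions (assumption, for illustration only).
RHO, K, C = 0.5, 2.0, 1.5

def alpha(t):
    return t**RHO

def beta(t):
    return psi(t, RHO, K)

def f(t):
    return C * t**RHO + psi(t, RHO, K) + 1.0   # the "+ 1" is a lower-order perturbation

def normalized(t, x):
    """Left-hand side of (4.12) for the hypothetical f, alpha, beta above."""
    return (f(t * x) - beta(t)) / alpha(t)
```

With these choices `normalized(t, x)` approaches `phi(x, RHO, K, C)` at rate $t^{-\rho}$, and `has_two_points_of_change(0.5, 0.0, 0.0)` returns `False`, matching the constraint $c \ne 0$ when $\rho \ne 0$, $k = 0$.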
4.2.3. Kernel convergence and standardization. Assuming a standardization function exists and
that the conditional distribution of $X$ given $Y$ satisfies kernel convergence assumption (4.1), we
can standardize directly through the Markov kernel. We consider the new kernel $K_f(y,A) =
K(y,f(A)) =: P[f^{\leftarrow}(X) \in A \mid Y = y]$. The next result may be compared to the formulation in [4,
Proposition 2.3.3] for joint distributions rather than Markov kernels.

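Proposition 4.3. Suppose the transition function $K : (0,\infty) \times \mathcal{B}[-\infty,\infty] \to [0,1]$ satisfies (4.1)
for a probability distribution $G$ on $[-\infty,\infty)$. If $f$ is a monotone function satisfying (4.12), then
the transition function $K_f : (0,\infty) \times \mathcal{B}[0,\infty] \to [0,1]$ defined as

$$
K_f(y,A) = K(y,f(A))
$$

satisfies (3.2),

$$
K_f(t,\, t[0,x]) \Rightarrow G(A_\varphi(x)) =: G_f([0,x]) \quad \text{on } [0,\infty],
$$

with $A_\varphi(x)$ as in (4.13). Conversely, if we start with a kernel $K(y,\cdot)$ satisfying (3.2), for limit
probability measure $G$ on $[0,\infty)$, then given ERV functions $\alpha > 0$, $\beta \in \mathbb{R}$ on $(0,\infty)$, if $f$ is monotone
on $(0,\infty)$ satisfying (4.12), the transition function $K_f : (0,\infty) \times \mathcal{B}[-\infty,\infty] \to [0,1]$ given by

$$
K_f(y,A) = K(y,f^{\leftarrow}(A))
$$

satisfies (4.1),

$$
K_f(t,\, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow G(A_{\varphi^{\leftarrow}}(x)) =: G_f([-\infty,x]) \quad \text{on } f([0,\infty]),
$$

where

$$
A_{\varphi^{\leftarrow}}(x) = \begin{cases} [0, \varphi^{\leftarrow}(x)] & f \text{ non-decreasing}, \\ [\varphi^{\leftarrow}(x), \infty] & f \text{ non-increasing}. \end{cases}
$$

Proof. Assume (4.1) and $f$ is a non-decreasing function satisfying (4.12). Then,

$$
K_f(t,\, t[0,x]) = K(t,\, [-\infty, f(tx)]) = K\!\left(t,\ \alpha(t)\left[-\infty, \frac{f(tx) - \beta(t)}{\alpha(t)}\right] + \beta(t)\right) \Rightarrow G([-\infty, \varphi(x)]).
$$

Conversely, if $f$ satisfies (4.12) for $\alpha, \beta \in \mathrm{ERV}$, then inverting (4.12) yields

$$
f^{\leftarrow}(\alpha(t)x + \beta(t))/t \to \varphi^{\leftarrow}(x), \quad x \in f((0,\infty)).
$$

Consequently,

$$
K_f(t,\, [-\infty, \alpha(t)x + \beta(t)]) = K\!\left(t,\ t\,[0,\, t^{-1} f^{\leftarrow}(\alpha(t)x + \beta(t))]\right) \Rightarrow G([0, \varphi^{\leftarrow}(x)]).
$$

The case for non-increasing $f$ is similar. □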

If $K$ is a version of the conditional distribution $P[X \in \cdot \mid Y = y]$ satisfying (4.1), where $\alpha, \beta \in
\mathrm{ERV}_{\rho,k}$ with $(\rho,k) \ne (0,0)$ and $F_Y$ is in the standardized domain of attraction, then $(X,Y)$ follows
a CEVM by Theorem 4.1. Furthermore, $(X,Y)$ can be standardized in the sense of (4.11) [4,
Proposition 2.3.3 (1)], and the standardization function $f$ satisfies (4.12) by Proposition 4.2.

4.2.4. Moment restrictions. Section 3.1 considered the standard case and found, in particular,
that if $X$ belongs to the standardized domain of attraction, then $E\xi \le 1$ where $\xi \sim G$. When
standardization is possible, a comparable moment restriction occurs provided $X$ has a distribution
in a domain of attraction.

Assume there exist normalizing functions $c(t) > 0$ and $d(t) \in \mathbb{R}$ such that

$$
t\,P\!\left[\frac{X - d(t)}{c(t)} > x\right] \to (1 + \lambda x)^{-1/\lambda}, \tag{4.16}
$$

implying that $c, d \in \mathrm{ERV}_{\lambda,1}$ (see Section 2.2, p. 3). If $(X,Y)$ follow a CEVM and (4.16) holds, then
the vector $(X,Y)$ belongs to a multivariate domain of attraction provided $\lim_{t\to\infty} \alpha(t)/c(t) \in [0,\infty)$
[5, Proposition 4.1]. Continuing the theme of assuming kernel convergence, consider the case where
(4.1) holds under the same normalization as in (4.16):

$$
K(t,\, [-\infty, c(t)x + d(t)]) \Rightarrow G(x).
$$

Then from (2.7), p. 3, $d$ is a standardization function satisfying (4.12), and

$$
\varphi(x) = \begin{cases} \lambda^{-1}(x^{\lambda} - 1) & \lambda \ne 0, \\ \log x & \lambda = 0. \end{cases}
$$

Theorem 3.1 gives a standard CEVM for $(d^{\leftarrow}(X), Y)$, and furthermore, $d^{\leftarrow}(X)$ belongs to the
standardized domain of attraction. Therefore, the distribution $G$ must satisfy

$$
E\,\varphi^{\leftarrow}(\xi) \le 1, \quad \xi \sim G.
$$

Depending on $\lambda$, this reduces to

$$
\begin{cases} E\,(\lambda^{-1} + \xi)_+^{1/\lambda} \le \lambda^{-1/\lambda} & \lambda > 0, \\ E\,(|\lambda|^{-1} - \xi)^{-1/|\lambda|} \le |\lambda|^{1/|\lambda|} & \lambda < 0, \\ E\,e^{\xi} \le 1 & \lambda = 0. \end{cases}
$$

Thus, we obtain a different condition for each class of extreme value distribution. In the Fréchet
case, we have a bound on the $1/\lambda$-th moment of the right tail. If the domain of attraction is
Weibull, this becomes an integrability condition near 0. Finally, in the Gumbel case, the right tail
of $\xi$ is exponentially bounded, so all right-tail moments exist.

4.3. Relation to the Heffernan and Tawn Model. The CEVM of Theorem 4.1 is inspired by
the work of Heffernan and Tawn [11]. Where Heffernan and Tawn's model is based on the
convergence of conditional distributions as in (4.1), the general CEVM defined in (2.9), (2.10) focuses
on limits of joint distributions. Our Theorem 4.1 shows that Heffernan and Tawn's assumption
[11, Equation (3.1)] leads to a CEVM provided their normalization functions $\alpha$ and $\beta$ are ERV.
The fact that they require the convergence (4.1) to hold at all points $x$ suggests that they are expecting
continuous limits whereas we framed the assumption as weak convergence.

A condition such as (4.1) tacitly assumes a particular version of the conditional distribution. The
issue of version cannot be ignored, since Example 3.7 shows that (4.1) holding for one particular
version does not imply that it holds for every version. The issue of version is usually handled by
smoothness assumptions.

For a non-degenerate CEVM, the functions $\alpha$ and $\beta$ are necessarily ERV. Heffernan and Tawn
assume a parametric form for these functions. They specify

$$
\alpha(y) = b_{|i}(y) := y^{\rho}
$$

for some constant $\rho < 1$ and

$$
\beta(y) = \begin{cases} a\,y & 0 \le \rho < 1, \text{ with } a \in [0,1], \\ c - d \log y & \rho < 0, \text{ with } a = 0,\ c \in \mathbb{R},\ d \in [0,1]. \end{cases}
$$

Although more general models are possible, the form of the ERV limit function $\psi$ in (2.2) (p. 2)
suggests that a parametric approach is indeed reasonable.
5. General Normalizations for both X and Y

So far we assumed that $Y$ satisfies $t\,P[Y > ty] \to y^{-1}$ for $y > 0$. We now extend Theorem 4.1 to
the case where $Y$ belongs to a general domain of attraction:

$$
t\,P[Y > a(t)y + b(t)] \to (1 + \gamma y)^{-1/\gamma}, \quad y \in \mathbb{E}_\gamma, \tag{5.1}
$$

and $\mathbb{E}_\gamma := \{y : 1 + \gamma y > 0\}$. Assume $b(t)$ is given by (2.6). Without change, (4.1) may no longer be
sufficient to obtain a general CEVM limit (2.9) if $Y$ requires normalization according to $a$ and $b$.

To relate kernel convergence to the CEVM when (5.1) is the hypothesis, there are two ways to
proceed: (i) Assume $K(a(t)u + b(t),\, [-\infty, \alpha(t)x + \beta(t)]) \to \varphi_x(u)$ for $u > 0$, and then (2.9) should
follow from arguments similar to those in Section 4.1. (ii) Standardize $Y$ via the transformation
$Y \mapsto b^{\leftarrow}(Y)$, use a version of $P[X \in \cdot \mid b^{\leftarrow}(Y) = y] =: K^*(y, \cdot)$, and rely on (4.1) for $K^*$. We show
the consistency of these two approaches.

5.1. Kernel Asymptotics. The transition function $K : (-\infty,\infty) \times \mathcal{B}[-\infty,\infty] \to [0,1]$ is a specific
version of the conditional distribution of $X$ given $Y$, $K(y,\cdot) = P[X \in \cdot \mid Y = y]$. To consider (ii)
above, we first express a version of the conditional distribution of $X$ given $b^{\leftarrow}(Y)$ in terms of $K$.

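When the distribution of $Y$ is not in the Fréchet domain of attraction, the convergence (5.1),
where $b$ is given by (2.6), implies that $a, b \in \mathrm{ERV}_{\gamma,1}$ for some $\gamma \in \mathbb{R}$. Hence, $a \in \mathrm{RV}_\gamma$, and

$$
\frac{b(tx) - b(t)}{a(t)} \to \begin{cases} \gamma^{-1}(x^{\gamma} - 1) & \gamma \ne 0, \\ \log x & \gamma = 0, \end{cases} \quad x > 0. \tag{5.2}
$$

Inverting (5.2) gives

$$
\frac{b^{\leftarrow}(a(t)x + b(t))}{t} \to \begin{cases} (1 + \gamma x)^{1/\gamma} & \gamma \ne 0, \\ e^{x} & \gamma = 0. \end{cases} \tag{5.3}
$$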
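For a concrete instance of (5.2) and (5.3), consider the following sketch. It assumes the hypothetical choice $b(t) = \gamma^{-1}(t^{\gamma} - 1)$ and $a(t) = t^{\gamma}$, which corresponds to a generalized Pareto tail $P[Y > y] = (1+\gamma y)^{-1/\gamma}$; whether this agrees with the paper's definition (2.6) of $b$ is an assumption, and all function names are illustrative. For this choice both relations hold exactly for every $t$, not only in the limit.

```python
import math

def b(t, gamma):
    """Hypothetical centering b(t) = (t^gamma - 1)/gamma, read as log t when gamma = 0."""
    return math.log(t) if gamma == 0 else (t**gamma - 1.0) / gamma

def a(t, gamma):
    """Hypothetical scaling a(t) = t^gamma, regularly varying with index gamma."""
    return t**gamma

def b_inv(s, gamma):
    """Inverse of b: (1 + gamma s)^(1/gamma), read as e^s when gamma = 0."""
    return math.exp(s) if gamma == 0 else (1.0 + gamma * s) ** (1.0 / gamma)

def lhs_52(t, x, gamma):
    """Left-hand side of (5.2)."""
    return (b(t * x, gamma) - b(t, gamma)) / a(t, gamma)

def lhs_53(t, x, gamma):
    """Left-hand side of (5.3); requires 1 + gamma x > 0."""
    return b_inv(a(t, gamma) * x + b(t, gamma), gamma) / t

def limit_52(x, gamma):
    return math.log(x) if gamma == 0 else (x**gamma - 1.0) / gamma

def limit_53(x, gamma):
    return math.exp(x) if gamma == 0 else (1.0 + gamma * x) ** (1.0 / gamma)
```

Composing the two limits recovers the standard tail used later for $Y^*$: `limit_53(limit_52(y, gamma), gamma)` equals `y` for any `gamma`.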

Furthermore, if $b^*$ is any function on $(0,\infty)$ satisfying

$$
(b^*(t) - b(t))/a(t) \to 0 \quad \text{as } t \to \infty, \tag{5.4}
$$

then (5.1), (5.2), and (5.3) hold with $b$ replaced by $b^*$. A standard technique is to choose a smooth,
strictly monotone $b^*$ as is summarized next (cf. [21]).

Lemma 5.1. There exists a function $b^*$ satisfying (5.4) that is continuous and strictly monotone.

Proof. Consider cases: If $\gamma = 0$, then $b \in \Pi(a)$ and there exists [16, Proposition 0.16] $\tilde b$ continuous,
strictly increasing such that $(\tilde b(t) - b(t))/a(t) \to 1$. The choice $b^*(x) = \tilde b(e^{-1}x)$ satisfies
(5.4). If $\gamma > 0$, then $b \in \mathrm{RV}_\gamma$, and $b(t)/a(t) \to \gamma^{-1}$ [7, Theorem B.2.2 (1)]. Consequently,
[15, Proposition 2.6 (vii)] gives a continuous, strictly increasing function $b^* \sim b$. Writing

$$
\frac{b^*(t) - b(t)}{a(t)} = \left(\frac{b^*(t)}{b(t)} - 1\right) \frac{b(t)}{a(t)}
$$

shows that $b^*$ satisfies (5.4). Finally, if $\gamma < 0$, then $b(\infty) = \lim_{t\to\infty} b(t)$ exists finite, $b(\infty) - b \in \mathrm{RV}_\gamma$,
and $(b(\infty) - b(t))/a(t) \to -\gamma^{-1}$. Choose $\tilde b$ continuous, strictly decreasing, with $\tilde b \sim (b(\infty) - b)$, and
set $b^* = b(\infty) - \tilde b$. □

It is easier to deal with $b^*$ rather than $b$ since $b^{*\leftarrow}(b^*(x)) = b^*(b^{*\leftarrow}(x)) = x$ but $b^*$ still standardizes
$Y$. By (5.2), $Y^* = b^{*\leftarrow}(Y)$ is in the standard domain of attraction when (5.1) holds:

$$
t\,P[Y^* > ty] = t\,P\!\left[\frac{Y - b^*(t)}{a(t)} > \frac{b^*(ty) - b^*(t)}{a(t)}\right] \to y^{-1}, \quad y > 0.
$$




Furthermore, if $K(y,\cdot) = P[X \in \cdot \mid Y = y]$,

$$
K^*(y,\cdot) := K(b^*(y),\cdot) \tag{5.5}
$$

is a version of the conditional distribution $P[X \in \cdot \mid Y^* = y]$. This follows from

$$
P[X \in A,\ Y^* > y] = \int_{(y,\infty)} K(b^*(u), A)\, P[Y^* \in du], \quad \text{measurable } A,\ y > 0. \tag{5.6}
$$

To see this, write

$$
P[X \in A,\ Y^* > y] = P[X \in A,\ Y > b^*(y)] = \int_{(b^*(y),\infty)} K(u, A)\, P[Y \in du] = \int_{(b^*(y),\infty)} K(b^*(b^{*\leftarrow}(u)), A)\, P[Y \in du],
$$

using the fact that $b^*(b^{*\leftarrow}(u)) = u$ for all $u$. Finish with a change of variables to get (5.6).

We now show that the two approaches to the CEVM discussed at the beginning of Section 5, 
the direct approach and the standardization approach, are consistent. 

Proposition 5.1. Suppose $Y$ has a distribution satisfying (5.1) and $K^*$ is given by (5.5). Given
normalization functions $\alpha(t) > 0$ and $\beta(t) \in \mathbb{R}$, there exists a transition function $\phi^* : (0,\infty) \times
\mathcal{B}[-\infty,\infty] \to [0,1]$ such that, as $t \to \infty$,

$$
K^*(t u_t,\, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow \phi^*(u,\, [-\infty,x]) \quad \text{on } [-\infty,\infty] \tag{5.7}
$$

whenever $u_t \to u \in (0,\infty)$, iff there is a transition function $\phi : \mathbb{E}_\gamma \times \mathcal{B}[-\infty,\infty] \to [0,1]$ such that,
as $t \to \infty$,

$$
K(a(t)u_t + b(t),\, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow \phi(u,\, [-\infty,x]) \quad \text{on } [-\infty,\infty] \tag{5.8}
$$

whenever $u_t \to u \in \mathbb{E}_\gamma$. If these convergences hold, then

(i) $\alpha, \beta \in \mathrm{ERV}$;

(ii) $\phi^* = \kappa_{G^*}$, a generalized tail kernel (4.3) with $G^* = \phi^*(1,\cdot)$;

(iii) $\phi(u, A) = \kappa_G((1 + \gamma u)^{1/\gamma}, A)$, where $\kappa_G$ is a generalized tail kernel with $G = \phi(0,\cdot)$; and

(iv) the two transition functions are related by $G = G^*$.

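Proof. Abbreviate $a_t = a(t)$ and $b_t = b(t)$. The convergences (5.2) and (5.3) are locally uniform
on $(0,\infty)$ (see Section 2.1). Since $b^*$ satisfies (5.4), it follows that

$$
\frac{b^*(t u_t) - b_t}{a_t} \to \frac{u^{\gamma} - 1}{\gamma} \quad \text{whenever } u_t \to u \in (0,\infty),
$$

and

$$
t^{-1} b^{*\leftarrow}(a_t u_t + b_t) \to (1 + \gamma u)^{1/\gamma} \quad \text{whenever } u_t \to u \in \mathbb{E}_\gamma.
$$

Assuming (5.7), for $u_t \to u \in \mathbb{E}_\gamma$ we have

$$
\begin{aligned}
K(a(t)u_t + b(t),\, [-\infty, \alpha(t)x + \beta(t)])
&= K\big(b^*(t\{t^{-1} b^{*\leftarrow}(a_t u_t + b_t)\}),\, [-\infty, \alpha(t)x + \beta(t)]\big) \\
&= K^*\big(t\{t^{-1} b^{*\leftarrow}(a_t u_t + b_t)\},\, [-\infty, \alpha(t)x + \beta(t)]\big) \\
&\Rightarrow \phi^*\big((1 + \gamma u)^{1/\gamma},\, [-\infty,x]\big) =: \phi(u,\, [-\infty,x]).
\end{aligned}
$$

Conversely, if (5.8) holds, then for $u_t \to u > 0$,

$$
\begin{aligned}
K^*(t u_t,\, [-\infty, \alpha(t)x + \beta(t)])
&= K\big(a_t \cdot a_t^{-1}(b^*(t u_t) - b_t) + b_t,\, [-\infty, \alpha(t)x + \beta(t)]\big) \\
&\Rightarrow \phi\big(\gamma^{-1}(u^{\gamma} - 1),\, [-\infty,x]\big) =: \phi^*(u,\, [-\infty,x]).
\end{aligned}
$$

In either case, $G := \phi(0,\cdot) = \phi^*(1,\cdot) =: G^*$. Proposition 4.1 shows that $\alpha$ and $\beta$ are ERV and
$\phi^* = \kappa_{G^*}$. Consequently, $\phi(u,\cdot) = \kappa_G((1 + \gamma u)^{1/\gamma},\cdot)$. □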

Therefore, by Proposition 4.1 (p. 9), if there exists a non-degenerate distribution $G$ on $[-\infty,\infty)$
such that

$$
K^*(t,\, [-\infty, \alpha(t)x + \beta(t)]) = K(b^*(t),\, [-\infty, \alpha(t)x + \beta(t)]) \Rightarrow G(x) \tag{5.9}
$$

with $\alpha, \beta \in \mathrm{ERV}$, then (5.8) holds.

How can we apply Proposition 5.1 starting from an assumption like (5.9) on the kernel $K$ rather
than $K^*$? Because $b^*(b^{*\leftarrow}(t)) = t$, (5.9) can be written as

$$
K(t,\, [-\infty,\ \alpha \circ b^{*\leftarrow}(t)\,x + \beta \circ b^{*\leftarrow}(t)]) \Rightarrow G(x) \quad \text{as } t \to y^*,
$$

where $y^*$ denotes the upper endpoint of the distribution of $Y$, written as $y^* = \sup\{y : F_Y(y) < 1\}$.
Therefore, we require there to exist a non-degenerate distribution $G$ and normalization functions
$\tilde\alpha > 0$ and $\tilde\beta$ such that

$$
K(t,\, [-\infty, \tilde\alpha(t)x + \tilde\beta(t)]) \Rightarrow G(x) \quad \text{as } t \to y^*, \tag{5.10}
$$

and $\alpha = \tilde\alpha \circ b^*$, $\beta = \tilde\beta \circ b^* \in \mathrm{ERV}$.



5.2. CEVM Properties. The standardization approach given in the previous section yields a 
CEVM when Y belongs to a general domain of attraction. 

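Theorem 5.1. Suppose $(X,Y)$ is a random vector on $\mathbb{R}^2$, where $F_Y \in D(G_\gamma)$ according to (5.1)
and $K(y,\cdot) = P[X \in \cdot \mid Y = y]$ satisfies (5.10) for normalizing functions $\tilde\alpha > 0$ and $\tilde\beta \in \mathbb{R}$ and
non-degenerate limit distribution $G$ on $[-\infty,\infty)$. Let $b^*$ be the function satisfying (5.4) given by
Lemma 5.1 and put $\alpha = \tilde\alpha \circ b^*$, $\beta = \tilde\beta \circ b^*$. Then, as $t \to \infty$,

$$
t\,P\!\left[\left(\frac{X - \beta(t)}{\alpha(t)},\ \frac{Y - b(t)}{a(t)}\right) \in \cdot\,\right] \stackrel{v}{\to} \mu(\cdot) \quad \text{in } M_+([-\infty,\infty] \times \mathbb{E}_\gamma), \tag{5.11}
$$

where $\mu$ is a non-null Radon measure satisfying the conditional non-degeneracy conditions (2.10),
iff $\alpha, \beta \in \mathrm{ERV}_{\rho,k}$. In this case, the limit measure $\mu$ is specified by

$$
\mu([-\infty,x] \times (y,\infty]) = \int_0^{(1+\gamma y)^{-1/\gamma}} G(u^{\rho}x + \psi(u))\,du, \quad x \in \mathbb{R},\ y \in \mathbb{E}_\gamma, \tag{5.12}
$$

with $\psi$ as in (2.2). The expression (5.12) is continuous in $x$ and $y$ if $(\rho,k) \ne (0,0)$.

Proof. First, observe that $Y^* = b^{*\leftarrow}(Y) \in D(G_1)$. Defining the transition function $K^*(y,\cdot) =
P[X \in \cdot \mid Y^* = y]$ as in (5.5), our hypotheses imply (5.9). Therefore, if $\alpha, \beta \in \mathrm{ERV}_{\rho,k}$, then by
Theorem 4.1, we have

$$
t\,P\!\left[\left(\frac{X - \beta(t)}{\alpha(t)},\ \frac{Y^*}{t}\right) \in \cdot\,\right] \stackrel{v}{\to} \mu^*(\cdot) \quad \text{in } M_+([-\infty,\infty] \times (0,\infty]),
$$

where $\mu^*$ is $\mu^*([-\infty,x] \times (y,\infty]) = \int_0^{y^{-1}} G(u^{\rho}x + \psi(u))\,du$, $x \in \mathbb{R}$, $y > 0$, and is conditionally
non-degenerate. Consequently, for $x \in \mathbb{R}$ and $y \in \mathbb{E}_\gamma$,

$$
\begin{aligned}
t\,P\!\left[\frac{X - \beta(t)}{\alpha(t)} \le x,\ \frac{Y - b(t)}{a(t)} > y\right]
&= t\,P\!\left[\frac{X - \beta(t)}{\alpha(t)} \le x,\ \frac{Y^*}{t} > \frac{b^{*\leftarrow}(a(t)y + b(t))}{t}\right] \\
&\to \int_0^{(1+\gamma y)^{-1/\gamma}} G(u^{\rho}x + \psi(u))\,du = \mu([-\infty,x] \times (y,\infty]),
\end{aligned}
$$

and the marginal transformation of $Y$ does not affect conditional non-degeneracy or continuity.
Conversely, (5.11) implies that $\alpha, \beta \in \mathrm{ERV}$ [10, Proposition 1]. □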

Alternatively, instead of standardizing Y, we could equally have used the convergence (5.8), 
which holds under our assumptions by Propositions 4.1 and 5.1. 

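Recalling the forms of the limit measure given in Section 4.1, we can express the limit measure
in (5.12) as

$$
\mu([-\infty,x] \times (y,\infty]) =
\begin{cases}
\dfrac{1}{\rho\,|x + k\rho^{-1}|^{1/\rho}} \displaystyle\int_0^{|x + k\rho^{-1}|(1+\gamma y)^{-\rho/\gamma}} u^{(1-\rho)/\rho}\, G\big(u\,\mathrm{sgn}(x + k\rho^{-1}) - k\rho^{-1}\big)\,du & \rho \ne 0, \\[2ex]
\dfrac{e^{-x/k}}{|k|} \displaystyle\int_{-\infty}^{x\,\mathrm{sgn}(k) - |k|\gamma^{-1}\log(1+\gamma y)} e^{u/|k|}\, G\big(u\,\mathrm{sgn}(k)\big)\,du & \rho = 0,\ k \ne 0, \\[2ex]
(1 + \gamma y)^{-1/\gamma}\, G(x) & \rho = 0,\ k = 0,
\end{cases}
$$

where $\mathrm{sgn}(v) = (v/|v|)\,\mathbf{1}_{\{v \ne 0\}}$, and we read the measure as $(1 + \gamma y)^{-1/\gamma} G(-k\rho^{-1})$ when $x = -k\rho^{-1}$
for the case $\rho \ne 0$.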
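The change of variables behind these explicit forms can be checked numerically against (5.12). The sketch below assumes a hypothetical limit distribution (the logistic cdf) and uses a plain midpoint rule; it covers only $\rho > 0$ and $\rho = 0$, with prefactors obtained from the substitutions $w = u^{\rho}|x + k\rho^{-1}|$ and $w = x\,\mathrm{sgn}(k) + |k|\log u$. All function names and parameter values are illustrative and do not come from the paper.

```python
import math

def G(z):
    """Hypothetical limit distribution (logistic cdf), for illustration only."""
    return 1.0 / (1.0 + math.exp(-z))

def midpoint(g, lo, hi, n=20_000):
    """Midpoint rule for the integral of g over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def upper(y, gamma):
    """Upper limit (1 + gamma y)^(-1/gamma), read as e^{-y} when gamma = 0."""
    return math.exp(-y) if gamma == 0 else (1.0 + gamma * y) ** (-1.0 / gamma)

def mu_512(x, y, rho, k, gamma):
    """Right-hand side of (5.12): integral of G(u^rho x + psi(u)) over (0, upper)."""
    if rho == 0:
        psi = lambda u: k * math.log(u)
    else:
        psi = lambda u: k * (u**rho - 1.0) / rho
    return midpoint(lambda u: G(u**rho * x + psi(u)), 0.0, upper(y, gamma))

def mu_explicit(x, y, rho, k, gamma):
    """Explicit forms after a change of variables (rho > 0 or rho = 0 only)."""
    c = upper(y, gamma)
    if rho > 0:
        s = x + k / rho
        if s == 0:
            return c * G(-k / rho)
        sgn = math.copysign(1.0, s)
        integrand = lambda w: w ** ((1.0 - rho) / rho) * G(w * sgn - k / rho)
        return midpoint(integrand, 0.0, abs(s) * c**rho) / (rho * abs(s) ** (1.0 / rho))
    if k == 0:
        return c * G(x)
    sgn = math.copysign(1.0, k)
    top = x * sgn + abs(k) * math.log(c)
    # The lower limit -infinity is truncated; the neglected mass is of order e^{-40}.
    integrand = lambda w: math.exp(w / abs(k)) * G(w * sgn)
    return math.exp(-x / k) * midpoint(integrand, top - 40.0 * abs(k), top) / abs(k)
```

Agreement of `mu_512` and `mu_explicit` for, say, `rho = 0.5` and `rho = 0.0` confirms the substitutions for those cases only; the case $\rho < 0$ is deliberately left out of this sketch.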

In Example 4.1 (p. 11), we presented a transition function satisfying (4.1) which did not lead
to a CEVM when paired with $Y \in D(G_1)$. We now show that a non-degenerate CEVM may be
obtained if $Y$ belongs to a non-standardized domain of attraction.

Example 5.1. Consider $Y \sim \mathrm{Exp}(1)$, and $U \sim \mathrm{Uniform}(0,1)$, independent of $Y$. Put $X = U e^{Y}$.
Note that $Y \in D(G_0)$ with $a(t) = 1$, $b(t) = \log t$, since for $y \in \mathbb{R}$ and $t$ large enough that $y + \log t > 0$,

$$
t\,P[Y > y + \log t] = t\,e^{-(y + \log t)} = e^{-y}.
$$

A version of the conditional distribution is given by

$$
K(y,\, [0,x]) = P[X \le x \mid Y = y] = P[U \le x e^{-y}] = x e^{-y} \wedge 1.
$$

Taking $\tilde\alpha(t) = e^{t}$, we saw in Example 4.1 that

$$
K(t,\, \tilde\alpha(t)[0,x]) \Rightarrow x \wedge 1 = G(x),
$$

although $\tilde\alpha$ is not regularly varying. Since $b$ is continuous and strictly monotone, set $\alpha(t) =
\tilde\alpha(b(t)) = t$. Then

$$
K^*(t,\, t[0,x]) = K(b(t),\, \tilde\alpha(b(t))[0,x]) \Rightarrow G(x),
$$

and $\alpha(t) \in \mathrm{RV}_1$. Hence, $K^*(tu,\, \alpha(t)[0,x]) \Rightarrow x u^{-1} \wedge 1 = G(u^{-1}x)$, and $K^*(y,\cdot) = P[X \in \cdot \mid Y^* = y]$.
On the other hand, note that for $u \in \mathbb{R}$,

$$
K(a(t)u + b(t),\, \alpha(t)[0,x]) = t x e^{-u - \log t} \wedge 1 = x e^{-u} \wedge 1 = G((e^{u})^{-1}[0,x]).
$$

This illustrates the equivalence presented in Proposition 5.1 (p. 16). Now, for $x > 0$, $y > 0$, the
joint distribution is given by

$$
P[X \le x,\ Y > y] = \int_{\log x \vee y}^{\infty} x e^{-2u}\,du + \int_{y}^{\log x} e^{-u}\,du\ \mathbf{1}_{\{y < \log x\}}.
$$

Therefore, for $x > 0$, $y \in \mathbb{R}$, and large $t$, we have

$$
t\,P[X \le tx,\ Y > y + \log t] = \begin{cases} \dfrac{x e^{-2y}}{2} & \text{if } \log x \le y, \\[1.5ex] e^{-y} - \dfrac{1}{2x} & \text{if } \log x > y \end{cases} \ = \mu([0,x] \times (y,\infty]),
$$

and $(X,Y)$ follow a CEVM by Theorem 5.1.
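The limit in Example 5.1 can be verified numerically. The sketch below integrates the kernel $K(u,[0,x]) = x e^{-u} \wedge 1$ against the $\mathrm{Exp}(1)$ density with a midpoint rule and compares $t\,P[X \le tx,\ Y > y + \log t]$ with the stated limit measure; the function names, the truncation point `upper` and the grid size are illustrative assumptions.

```python
import math

def kernel(y, x):
    """K(y, [0, x]) = P[U <= x e^{-y}] = min(x e^{-y}, 1) for X = U e^Y."""
    return min(x * math.exp(-y), 1.0)

def joint(x, y, n=100_000, upper=60.0):
    """P[X <= x, Y > y] for y >= 0: integral of K(u, [0, x]) e^{-u} over (y, upper).
    The tail beyond `upper` is of order e^{-60} and is ignored."""
    h = (upper - y) / n
    total = 0.0
    for i in range(n):
        u = y + (i + 0.5) * h
        total += kernel(u, x) * math.exp(-u)
    return h * total

def mu(x, y):
    """Limit measure mu([0, x] x (y, inf]) from Example 5.1."""
    if math.log(x) <= y:
        return 0.5 * x * math.exp(-2.0 * y)
    return math.exp(-y) - 1.0 / (2.0 * x)

def scaled_probability(t, x, y):
    """t P[X <= t x, Y > y + log t]; needs y + log t >= 0."""
    return t * joint(t * x, y + math.log(t))
```

In this example the scaled probability equals the limit for every $t$ with $y + \log t \ge 0$, so any discrepancy reported by `scaled_probability(t, x, y) - mu(x, y)` is quadrature error rather than slow convergence.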

6. Conclusions and Future Directions 

In many statistical contexts a conditional formulation such as (4.1) is convenient. An example 
is when we model X as an explicit function of Y or when we work with distributions that have 
continuous densities, in which case the natural choice of version of the conditional distribution is the 
absolutely continuous one. In such cases, the approach of Heffernan and Tawn [11] is natural and leads to
a parsimonious extremal model which can account for varying degrees of asymptotic independence.
Heffernan and Tawn propose a semiparametric model, where the limit distribution $G$ is estimated
nonparametrically, and the normalization functions $\alpha$ and $\beta$ belong to a parametric family. The
extended regular variation of $\alpha$ and $\beta$ provides justification for the form of the parametric family.
The formulas for the limit measure derived in our present paper show that assuming conditional
distributions leads to a simpler CEV model parametrized by the distribution $G$ and the pair $(\rho,k)$,
along with $\gamma$, the extreme value index of $Y$.

Fitting a bivariate CEV model has been considered in [6, 9]. These authors discuss statistics for 
detecting a CEV model and estimating the normalizing functions. However, open questions remain, 
such as the asymptotic distributions of estimators, and the appropriate method for nonparametric 
estimation of G. 

A natural extension of the bivariate model is to higher-dimensional vectors and this was the 
original intention of Heffernan and Tawn, who apply their methodology to a five-dimensional air 
pollution dataset. It is not clear how to condition on more than one extreme variable; presumably 
there are connections between such a model and the usual multivariate domain of attraction model. 
Cases where asymptotic independence is present between some pairs of variables but not others
require careful treatment, and how to proceed in high dimensions is not apparent.